We study the complexity of basic regular operations on languages represented by incomplete deterministic
or nondeterministic automata, in which all states are final. Such languages are known to
be prefix-closed. We get tight bounds on both incomplete and nondeterministic state complexity of 
complement, intersection, union, concatenation, star, and reversal on prefix-closed languages. 


1 Introduction 


A language L is prefix-closed if w ∈ L implies that every prefix of w is in L. It is known that a regular
language is prefix-closed if and only if it is accepted by a nondeterministic finite automaton (NFA) with 
all states final [18]. In the minimal incomplete deterministic finite automaton (DFA) for a prefix-closed 
language, all the states are final as well. 

The authors of [18] examined several questions concerning NFAs with all states final. They proved
that the inequivalence problem for NFAs with all states final is PSPACE-complete in the binary case, 
but polynomially solvable in the unary case. Next, they showed that minimizing a binary NFA with all 
states final is PSPACE-hard, and that deciding whether a given NFA accepts a language that is not prefix- 
closed is PSPACE-complete, while the same problem for DFAs can be solved in polynomial time. The 
NFA-to-DFA conversion and complementation of NFAs with all states final have also been considered
in [18], and the tight bound 2^n for the first problem, and the lower bound 2^{n-1} for the second one have
been obtained.

The quotient complexity of prefix-closed languages has been studied in [5]. The quotient of a language
L by the string w is the set L_w = {x | wx ∈ L}. The quotient complexity of a language L, κ(L),
is the number of distinct quotients of L. Quotient complexity is defined for any language, and it is finite
if and only if the language is regular. The quotient automaton of a regular language L is the DFA
({L_w | w ∈ Σ*}, Σ, ·, L_ε, F), where L_w · a = L_{wa}, and a quotient L_w is final if it contains the empty string.
The quotient automaton of L is a minimal complete DFA for L, so quotient complexity is the same as the
state complexity of L, which is defined as the number of states in the minimal DFA for L. In [5], the tight
bounds on the quotient complexity of basic regular operations have been obtained, and to prove upper
bounds, the properties of quotients have been used rather than automata constructions.
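
The notion of a quotient can be illustrated on a finite language. The following Python fragment is only a minimal sketch; the helper names `quotient` and `quotient_complexity` and the sample language are assumptions made for illustration. It explores the quotients through the rule L_w · a = L_{wa} and counts them.

```python
def quotient(language, w):
    """Return the quotient L_w = {x | wx in L} of a finite language."""
    return frozenset(x[len(w):] for x in language if x.startswith(w))


def quotient_complexity(language, alphabet):
    """Count the distinct quotients by exploring L_w . a = L_{wa}."""
    start = frozenset(language)
    seen = {start}
    todo = [start]
    while todo:
        current = todo.pop()
        for a in alphabet:
            nxt = quotient(current, a)
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return len(seen)


# Hypothetical example: the prefix-closed language {eps, a, aa} over {a, b}.
L = {"", "a", "aa"}
kappa = quotient_complexity(L, "ab")
```

In this example the empty quotient plays the role of the dead state of the complete DFA, so the remaining quotients correspond to the states of an incomplete DFA.
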

Automata with all states final represent systems, for example, production lines, and their intersection
or parallel composition represents the composition of these systems [21]. A question that arises here
is whether the complexity of intersection of automata with all states final is the same as in the general
case of arbitrary DFAs or NFAs. At first glance, it seems that this complexity could be smaller.
Our first result shows that this is not the case. We show that both incomplete and nondeterministic state 
complexity of intersection on prefix-closed languages is given by the function mn, which is the same as 
in the general case of regular languages. 

In the deterministic case, to have all the states final, we have to consider incomplete deterministic 
automata because otherwise, the complete automaton with all states final would accept the language 
consisting of all the strings over an input alphabet. Notice that the model of incomplete deterministic 
automata has been considered already by Maslov [20]. The same model has been used in the study of 
the complexity of the shuffle operation [6]; here, the complexity on complete DFAs is not known yet. 

We next study the complexity of complement, union, concatenation, square, star, and reversal on 
languages represented by incomplete DFAs or NFAs with all states final. We get tight bounds in both 
nondeterministic and incomplete deterministic cases. In the nondeterministic case, all the bounds are the 
same as in the general case of regular languages, except for the bound for star that is n instead of n + 1.
However, to prove the tightness of these bounds, we usually use larger alphabets than in the general case
of regular languages where all the upper bounds can be met by binary languages [10, 12].

To get lower bounds, we use a fooling-set lower-bound method [1, 2, 3, 8, 11]. In the case of union
and reversal, the method does not work since it provides a lower bound on the size of NFAs with multiple
initial states. Since the nondeterministic state complexity of a regular language is defined using a model
of NFAs with a single initial state [10], we have to use a modified fooling-set technique to get the tight
bounds m + n + 1 and n + 1 for union and reversal, respectively.

In the case of incomplete deterministic finite automata, the tight bounds for complement, union,
concatenation, star, and reversal are n + 1, mn + m + n, m·2^{n-1} + 2^n - 1, 2^{n-1}, and 2^n - 1, respectively.
To define worst-case examples, we use a binary alphabet for union, star, and reversal, and a ternary
alphabet for concatenation.

The paper is organized as follows. In the next section, we give some basic definitions and preliminary
results. In Sections 3 and 4, we study boolean operations. Concatenation is discussed in Section 5, and
star and reversal in Section 6. The last section contains some concluding remarks.


2 Preliminaries 


In this section, we recall some basic definitions and preliminary results. For details and all unexplained 
notions, the reader may refer to [24]. 

A nondeterministic finite automaton (NFA) is a quintuple A = (Q, Σ, δ, I, F), where Q is a finite set
of states, Σ is a finite alphabet, δ: Q × Σ → 2^Q is the transition function which is extended to the domain
2^Q × Σ* in the natural way, I ⊆ Q is the set of initial states, and F ⊆ Q is the set of final states. The
language accepted by A is the set L(A) = {w ∈ Σ* | δ(I, w) ∩ F ≠ ∅}.

The nondeterministic state complexity of a regular language L, nsc(L), is the smallest number of 
states in any NFA with a single initial state recognizing L. 

An NFA A is incomplete deterministic (DFA) if |I| = 1 and |δ(q, a)| ≤ 1 for each q in Q and each
a in Σ. In such a case, we write δ(q, a) = q' instead of δ(q, a) = {q'}. A non-final state q of a DFA is
called a dead state if δ(q, a) = q for each symbol a in Σ.

The incomplete state complexity of a regular language L, isc(L), is the smallest number of states in
any incomplete DFA recognizing L. An incomplete DFA is minimal (with respect to the number of states)
if it does not have any dead state, all its states are reachable, and no two distinct states are equivalent.

Every NFA A = (Q, Σ, δ, I, F) can be converted to an equivalent DFA A' = (2^Q, Σ, ·, I, F'), where
R · a = δ(R, a) and F' = {R ∈ 2^Q | R ∩ F ≠ ∅}. The DFA A' is called the subset automaton of the NFA A.
The subset automaton need not be minimal since some of its states may be unreachable or equivalent.
However, if for each state q of an NFA A, there exists a string w_q that is accepted by A only from the
state q, then the subset automaton of the NFA A does not have equivalent states since if two subsets of
the subset automaton differ in a state q, then they are distinguishable by w_q.
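
The subset construction described above can be sketched as follows. This is only an illustrative Python fragment; the function name `subset_automaton`, the dictionary encoding of δ, and the small sample NFA are assumptions and do not come from the constructions used later in the paper. Only the reachable subsets are generated.

```python
def subset_automaton(delta, initial, alphabet):
    """Reachable part of the subset automaton.

    delta maps (state, symbol) to a set of states; a missing key means
    that there is no transition. Returns the reachable subsets
    (as frozensets) and the transition table R . a = delta(R, a).
    """
    start = frozenset(initial)
    seen = {start}
    todo = [start]
    table = {}
    while todo:
        subset = todo.pop()
        for a in alphabet:
            target = frozenset(q for p in subset for q in delta.get((p, a), ()))
            table[(subset, a)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return seen, table


# Hypothetical two-state NFA used only to exercise the function.
delta = {(0, "a"): {0, 1}, (1, "b"): {0}}
subsets, table = subset_automaton(delta, {0}, "ab")
```
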

To prove the minimality of NFAs, we use a fooling set lower-bound technique, see [1, 2, 3, 8, 11].


Definition A set of pairs of strings {(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)} is called a fooling set for a language L
if for all i, j in {1, 2, ..., n}, the following two conditions hold:

(F1) x_i y_i ∈ L, and

(F2) if i ≠ j, then x_i y_j ∉ L or x_j y_i ∉ L.

It is well known that the size of a fooling set for a regular language provides a lower bound on the
number of states in any NFA (with multiple initial states) for the language. The argument is simple. Fix
the accepting computations of any NFA on strings x_i y_i and x_j y_j. Then, the states on these computations
reached after reading x_i and x_j must be distinct, otherwise the NFA accepts both x_i y_j and x_j y_i for two
distinct pairs. Hence we get the following observation.

Lemma 1 (see [1, 2, 3, 8, 11]). Let F be a fooling set for a language L. Then every NFA (with multiple initial
states) for the language L has at least |F| states. □
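
Conditions (F1) and (F2) are easy to check mechanically once a membership test for the language is available. The following Python fragment is a minimal sketch under that assumption; the names `is_fooling_set` and `in_K`, and the sample language of all strings a^k with k divisible by 3, are chosen for illustration only.

```python
def is_fooling_set(pairs, member):
    """Check conditions (F1) and (F2) for a list of pairs (x, y)."""
    for i, (xi, yi) in enumerate(pairs):
        if not member(xi + yi):                      # (F1) fails
            return False
        for xj, yj in pairs[i + 1:]:
            if member(xi + yj) and member(xj + yi):  # (F2) fails
                return False
    return True


def in_K(w):
    """Membership in the sample language {a^k | k is divisible by 3}."""
    return set(w) <= {"a"} and len(w) % 3 == 0


pairs = [("a", "aa"), ("aa", "a"), ("aaa", "aaa")]
```

By Lemma 1, a successful check on `pairs` shows that every NFA for the sample language has at least three states.
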

The next lemma shows that sometimes, if we insist on having a single initial state in an NFA, one 
more state is necessary. It can be used in the case of union, reversal, cyclic shift [5], and AFA-to-NFA 
conversion [13]. In each of these cases, NFAs with a single initial state require one more state than NFAs 
with multiple initial states. For the sake of completeness, we recall the proof of the lemma here.

Lemma 2 ([14]). Let 𝒜 and ℬ be sets of pairs of strings and let u and v be two strings such that 𝒜 ∪ ℬ,
𝒜 ∪ {(ε, u)}, and ℬ ∪ {(ε, v)} are fooling sets for a language L. Then every NFA with a single initial
state for the language L has at least |𝒜| + |ℬ| + 1 states.


Proof. Consider an NFA for a language L, and let 𝒜 = {(x_i, y_i) | i = 1, 2, ..., m} and ℬ = {(x_{m+j}, y_{m+j}) |
j = 1, 2, ..., n}. Since the strings x_k y_k are in L, we fix an accepting computation of the NFA on each string
x_k y_k. Let p_k be the state on this computation that is reached after reading x_k. Since 𝒜 ∪ ℬ is a fooling set
for L, the states p_1, p_2, ..., p_{m+n} are pairwise distinct. Since 𝒜 ∪ {(ε, u)} is a fooling set, the initial state
is distinct from all the states p_1, p_2, ..., p_m. Since ℬ ∪ {(ε, v)} is a fooling set, the (single) initial state
is also distinct from all the states p_{m+1}, p_{m+2}, ..., p_{m+n}. Thus the NFA has at least m + n + 1 states. □


Example Let K = (a^3)* and L = (b^3)*. Then nsc(K) = 3 and nsc(L) = 3, and the language K ∪ L is
accepted by a 6-state NFA with two initial states. Therefore, we cannot expect that we will be able to
find a fooling set for K ∪ L of size 7. However, every NFA with a single initial state for the language
K ∪ L requires at least 7 states since Lemma 2 is satisfied for the language K ∪ L with

𝒜 = {(a, a^2), (a^2, a), (a^3, a^3)},
ℬ = {(b, b^2), (b^2, b), (b^3, b^3)},
u = b^3, and
v = a^3.
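
The three hypotheses of Lemma 2 for the language (a^3)* ∪ (b^3)* can be verified by direct computation. The Python fragment below is a sketch only; the names `is_fooling_set`, `in_union`, and `lower_bound` are assumptions introduced for illustration.

```python
def is_fooling_set(pairs, member):
    """Conditions (F1) and (F2) for a list of pairs (x, y)."""
    f1 = all(member(x + y) for x, y in pairs)
    f2 = all(not (member(x1 + y2) and member(x2 + y1))
             for k, (x1, y1) in enumerate(pairs)
             for x2, y2 in pairs[k + 1:])
    return f1 and f2


def in_union(w):
    """Membership in (a^3)* united with (b^3)*."""
    return len(w) % 3 == 0 and (set(w) <= {"a"} or set(w) <= {"b"})


A = [("a", "aa"), ("aa", "a"), ("aaa", "aaa")]
B = [("b", "bb"), ("bb", "b"), ("bbb", "bbb")]
u, v = "bbb", "aaa"

hypotheses = [
    is_fooling_set(A + B, in_union),
    is_fooling_set(A + [("", u)], in_union),
    is_fooling_set(B + [("", v)], in_union),
]
# Lemma 2 gives |A| + |B| + 1 states whenever all three hypotheses hold.
lower_bound = len(A) + len(B) + 1 if all(hypotheses) else None
```
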

If w = uv for strings u and v, then u is a prefix of w. A language L is prefix-closed if w ∈ L implies
that every prefix of w is in L. The following observations are easy to prove.


Proposition 3 ([18]). A regular language is prefix-closed if and only if it is accepted by some NFA with
all states final. □


Proposition 4. Let A be a minimal incomplete DFA for a language L. Then the language L is prefix-closed
if and only if all the states of the DFA A are final. □
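
One direction of these observations can be illustrated experimentally: in an incomplete DFA with all states final, a string is accepted exactly when it can be read, so every prefix of an accepted string is accepted as well. The following Python fragment is a minimal sketch; the helper names and the two-state sample automaton are assumptions made for illustration, and the check is restricted to strings of bounded length.

```python
from itertools import product


def accepts(delta, start, word):
    """Incomplete DFA with all states final: accept iff the word can be read."""
    state = start
    for symbol in word:
        if (state, symbol) not in delta:
            return False
        state = delta[(state, symbol)]
    return True


def language_up_to(delta, start, alphabet, max_len):
    """All accepted strings of length at most max_len."""
    return {"".join(w)
            for k in range(max_len + 1)
            for w in product(alphabet, repeat=k)
            if accepts(delta, start, "".join(w))}


def is_prefix_closed(words):
    """Every proper prefix of every string must be in the set as well."""
    return all(w[:k] in words for w in words for k in range(len(w)))


# Hypothetical automaton: at most one a, any number of b's.
delta = {(0, "a"): 1, (0, "b"): 0, (1, "b"): 1}
words = language_up_to(delta, 0, "ab", 4)
```
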


3 Complementation 


If L is a language over an alphabet Σ, then the complement of L is the language L^c = Σ* \ L. If L is
accepted by a minimal complete DFA A, then we can get a minimal DFA for L^c from the DFA A by
interchanging the final and non-final states. In the case of incomplete DFAs, we first have to add a dead 
state, that is, a non-final state which goes to itself on each input, and let all the undefined transitions go 
to the dead state. After that, we can interchange the final and non-final states to get a (complete) DFA 
for the complement. This gives the following result. 


Theorem 5. Let n ≥ 1. Let L be a prefix-closed regular language over an alphabet Σ with isc(L) = n.
Then isc(L^c) ≤ n + 1, and the bound is tight if |Σ| ≥ 1.


Proof. For tightness, we can consider the unary prefix-closed language {a^i | 0 ≤ i ≤ n - 1}. □
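
The complementation of an incomplete DFA with all states final can be sketched as follows. This Python fragment is an illustration only; the function names and the label "dead" for the added state are assumptions. It completes the automaton with a dead state and makes that state the only final one, and then applies the construction to the unary witness language for n = 4.

```python
def complement_incomplete_dfa(states, alphabet, delta, start):
    """Complement an incomplete DFA in which all states are final."""
    dead = "dead"  # hypothetical name of the added state
    total = {}
    for q in list(states) + [dead]:
        for a in alphabet:
            # undefined transitions (and all transitions of dead) go to dead
            total[(q, a)] = delta.get((q, a), dead)
    # after interchanging final and non-final states, only dead is final
    return list(states) + [dead], total, start, {dead}


def accepts(delta, start, finals, word):
    q = start
    for a in word:
        q = delta[(q, a)]
    return q in finals


n = 4
states = list(range(n))
delta = {(i, "a"): i + 1 for i in range(n - 1)}  # accepts a^i for 0 <= i <= n - 1
c_states, c_delta, c_start, c_finals = complement_incomplete_dfa(states, "a", delta, 0)
```
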


If a language L is represented by an n-state NFA, then we first construct the corresponding subset
automaton, and then interchange the final and non-final states to get a DFA for the language L^c of at
most 2^n states. This upper bound on the nondeterministic state complexity of complement on regular
languages is known to be tight in the binary case [12].

For prefix-closed languages, we get the same bound; however, to prove tightness, we use a ternary
alphabet. Whether or not the bound 2^n can be met by a binary language remains open.


Theorem 6. Let n ≥ 2. Let L be a prefix-closed regular language over an alphabet Σ with nsc(L) = n.
Then nsc(L^c) ≤ 2^n, and the bound is tight if |Σ| ≥ 3.


Proof. The upper bound is the same as in the general case of regular languages [10]. To prove tightness,
consider the language L accepted by the NFA N shown in Figure 1, in which state n goes to the empty
set on both a and b, and to {1} on c. Each other state i goes to {i + 1} on both a and c, and to {1, i + 1}
on b. Our aim is to describe a fooling set F = {(x_S, y_S) | S ⊆ {1, 2, ..., n}} of size 2^n for L^c.

First, let us show that each subset of {1, 2, ..., n} is reachable in the subset automaton of the NFA N.
The initial state is {1}, and each singleton set {i} is reached from {1} by a^{i-1}. The empty set is reached
from {n} by a. The set {i_1, i_2, ..., i_k} of size k, where 2 ≤ k ≤ n and 1 ≤ i_1 < i_2 < ... < i_k ≤ n, is reached
from the set {i_2 - i_1, ..., i_k - i_1} of size k - 1 by the string b a^{i_1 - 1}. This proves reachability by induction.
Now, define x_S as the string by which the initial state 1 of the NFA N goes to the set S.


Figure 1: The NFA N of a prefix-closed language L with nsc(L^c) = 2^n.

Next, for a subset S of {1, 2, ..., n}, define the string y_S as the string y_S = y_0 y_1 ... y_{n-1} of length n,
where

    y_i = a, if n - i ∈ S,
    y_i = c, if n - i ∉ S.

We claim that the string y_S is rejected by the NFA N from each state in S and accepted from each state
that is not in S. Indeed, if i is a state in S, then y_{n-i} = a and y_S = uav with u = y_0 y_1 ... y_{n-i-1} and
v = y_{n-i+1} y_{n-i+2} ... y_{n-1}. Hence |u| = n - i, which means that the state i goes to {n} by u since both a
and c move each state q to state q + 1. However, in state n the NFA N cannot read a, and therefore the
string y_S = uav is rejected from i. On the other hand, if i ∉ S, then y_{n-i} = c, and the string y_S = ucv with
|u| = n - i and |v| = i - 1 is accepted from i through the computation i --u--> n --c--> 1 --v--> i.
Now, we are ready to prove that the set of pairs of strings F = {(x_S, y_S) | S ⊆ {1, 2, ..., n}} is a
fooling set for the language L^c.

(F1) By x_S, the initial state 1 goes to the set S. The string y_S is rejected by N from each state in S. It
follows that the NFA N rejects the string x_S y_S. Thus the string x_S y_S is in L^c.

(F2) Let S ≠ T. Then without loss of generality, there is a state i such that i ∈ S and i ∉ T. By x_S, the
initial state 1 goes to S, so it also goes to the state i. Since i ∉ T, the string y_T is accepted by N from i.
Therefore, the NFA N accepts the string x_S y_T, and so this string is not in L^c.

Hence F is a fooling set for L^c of size 2^n. By Lemma 1, we have nsc(L^c) ≥ 2^n. □
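
Both ingredients of the proof, the reachability of all 2^n subsets and the behaviour of the strings y_S, can be checked by brute force for small n. The Python fragment below is a sketch only; the function names are assumptions, and the transitions are encoded from the verbal description of the NFA N given in the proof.

```python
from itertools import combinations


def witness_nfa(n):
    """Transitions of the NFA N: states 1..n over the alphabet {a, b, c}."""
    delta = {}
    for i in range(1, n):
        delta[(i, "a")] = {i + 1}
        delta[(i, "c")] = {i + 1}
        delta[(i, "b")] = {1, i + 1}
    delta[(n, "c")] = {1}  # state n has no transitions on a and b
    return delta


def step(delta, subset, word):
    for symbol in word:
        subset = frozenset(q for p in subset for q in delta.get((p, symbol), ()))
    return subset


def reachable_subsets(delta, start):
    seen = {frozenset(start)}
    todo = [frozenset(start)]
    while todo:
        s = todo.pop()
        for symbol in "abc":
            t = step(delta, s, symbol)
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return seen


def y(S, n):
    """The string y_S: position i holds a if n - i is in S, and c otherwise."""
    return "".join("a" if n - i in S else "c" for i in range(n))


def check(n):
    delta = witness_nfa(n)
    all_reached = len(reachable_subsets(delta, {1})) == 2 ** n
    subsets = [set(c) for k in range(n + 1)
               for c in combinations(range(1, n + 1), k)]
    # all states are final, so y_S is rejected from q iff no computation survives
    claim = all((len(step(delta, {q}, y(S, n))) == 0) == (q in S)
                for S in subsets for q in range(1, n + 1))
    return all_reached and claim
```
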


4 Intersection and Union 


In this section, we study the incomplete and nondeterministic state complexity of intersection and union 
of prefix-closed languages. If regular languages K and L are accepted by m-state and n-state NFAs, 
respectively, then the language K ∩ L is accepted by an NFA of at most mn states, and this bound is
known to be tight in the binary case [10]. Our first result shows that the bound mn can be met by 
binary prefix-closed languages. Then, using this result, we get the same bound on the incomplete state 
complexity of intersection on prefix-closed languages. 


Theorem 7. Let K and L be prefix-closed languages over an alphabet Σ with nsc(K) = m and nsc(L) = n.
Then nsc(K ∩ L) ≤ mn, and the bound is tight if |Σ| ≥ 2.

Proof. The upper bound is the same as for regular languages [10]. For tightness, consider prefix-closed
binary languages K = {w ∈ {a, b}* | #_a(w) ≤ m - 1} and L = {w ∈ {a, b}* | #_b(w) ≤ n - 1} that are
accepted by m-state and n-state incomplete DFAs A and B, respectively, shown in Figure 2.
Consider the set of pairs of strings F = {(a^i b^j, a^{m-1-i} b^{n-1-j}) | 0 ≤ i ≤ m - 1, 0 ≤ j ≤ n - 1} of size
mn. Let us show that F is a fooling set for the language K ∩ L.


Figure 2: The incomplete DFAs A and B of prefix-closed languages K and L with nsc(K ∩ L) = mn.

(F1) The string a^i b^j · a^{m-1-i} b^{n-1-j} has exactly m - 1 a's and n - 1 b's. It follows that it is in K ∩ L.

(F2) Let (i, j) ≠ (k, ℓ). If i < k, then the string a^k b^ℓ · a^{m-1-i} b^{n-1-j} contains m - 1 + (k - i) a's, and
therefore it is not in K. The case of j < ℓ is symmetric.

Hence F is a fooling set for K ∩ L, and the theorem follows. □


Theorem 8. Let K and L be prefix-closed languages over an alphabet Σ with isc(K) = m and isc(L) = n.
Then isc(K ∩ L) ≤ mn, and the bound is tight if |Σ| ≥ 2.


Proof. Let A = (Q_A, Σ, δ_A, s_A, Q_A) and B = (Q_B, Σ, δ_B, s_B, Q_B) be incomplete DFAs for K and L, respectively.
Define an incomplete product automaton M = (Q_A × Q_B, Σ, δ, (s_A, s_B), Q_A × Q_B), where

    δ((p, q), a) = (δ_A(p, a), δ_B(q, a)), if both δ_A(p, a) and δ_B(q, a) are defined,
    δ((p, q), a) is undefined, otherwise.

The DFA M accepts the language K ∩ L. This gives the upper bound mn. For tightness, consider the same
languages K and L as in the proof of the previous theorem. Notice that K and L are accepted by m-state
and n-state incomplete DFAs, respectively. We have shown that the nondeterministic state complexity of
their intersection is mn. It follows that the incomplete state complexity is also at least mn. □
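
The incomplete product automaton M can be sketched in Python as follows. The fragment is illustrative only; the names `product_incomplete` and `counter_dfa` are assumptions, and the witness DFAs are encoded from the definitions of K and L (counting occurrences of a and of b, respectively).

```python
def product_incomplete(delta_a, start_a, delta_b, start_b, alphabet):
    """Reachable part of the incomplete product automaton; all states are final."""
    start = (start_a, start_b)
    delta = {}
    seen = {start}
    todo = [start]
    while todo:
        p, q = todo.pop()
        for a in alphabet:
            # a transition exists only if it is defined in both automata
            if (p, a) in delta_a and (q, a) in delta_b:
                target = (delta_a[(p, a)], delta_b[(q, a)])
                delta[((p, q), a)] = target
                if target not in seen:
                    seen.add(target)
                    todo.append(target)
    return seen, delta


def counter_dfa(size, counted, other):
    """Incomplete DFA for strings with at most size - 1 occurrences of `counted`."""
    delta = {(i, other): i for i in range(size)}
    delta.update({(i, counted): i + 1 for i in range(size - 1)})
    return delta


m, n = 3, 4
states, delta = product_incomplete(counter_dfa(m, "a", "b"), 0,
                                   counter_dfa(n, "b", "a"), 0, "ab")
```
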


Our next result on the incomplete state complexity of union on prefix-closed languages can be derived 
from the result on the quotient complexity of union in [5]. For the sake of completeness, we restate it in 
terms of incomplete complexities, and recall the proof. 


Theorem 9. Let K and L be prefix-closed languages over an alphabet Σ with isc(K) = m and isc(L) = n.
Then isc(K ∪ L) ≤ mn + m + n, and the bound is tight if |Σ| ≥ 2.


Proof. Let A = ({0, 1, ..., m - 1}, Σ, δ_A, 0, F_A) and B = ({0, 1, ..., n - 1}, Σ, δ_B, 0, F_B) be incomplete DFAs
for the languages K and L, respectively. To construct a DFA for the language K ∪ L, we first add the
dead states m and n to the DFAs A and B, and let all the undefined transitions go to the dead states.
Now we construct the classic product automaton from the resulting complete DFAs with the state set
{0, 1, ..., m} × {0, 1, ..., n}. All its states are final, except for the state (m, n) that is dead, and we do not
count it. Hence we get the upper bound mn + m + n on the incomplete state complexity of union.


Figure 3: The product automaton for incomplete DFAs A and B from Figure 2; m = 3 and n = 4.


Figure 4: The NFAs A and B of prefix-closed languages K and L with nsc(K ∪ L) = m + n + 1.


For tightness, we again consider the languages described in the proof of Theorem 7. We add the dead
states m and n and construct the product automaton. The product automaton in the case of m = 3 and
n = 4 is shown in Figure 3.

Each state (i, j) of the product automaton is reached from the initial state (0, 0) by the string a^i b^j. Let
(i, j) and (k, ℓ) be two distinct states of the product automaton. If i < k, then the string a^{m-k} b^n is rejected
from (k, ℓ) and accepted from (i, j). If j < ℓ, then the string b^{n-ℓ} a^m is rejected from (k, ℓ) and accepted
from (i, j). Thus all the states in the product automaton are reachable and pairwise distinguishable, and
the lower bound mn + m + n follows. □
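
The count mn + m + n can be confirmed for small m and n by building the completed product automaton and counting its pairwise inequivalent states. The Python fragment below is a sketch under our own encoding; the function names are assumptions, and the partition refinement is a standard Moore-style procedure rather than a method taken from the paper.

```python
def union_product(m, n):
    """Completed product automaton of the witness DFAs; m and n are the dead states."""
    states = [(i, j) for i in range(m + 1) for j in range(n + 1)]

    def step(state, symbol):
        i, j = state
        if symbol == "a":
            return (min(i + 1, m), j)  # A counts a's and falls into its dead state m
        return (i, min(j + 1, n))      # B counts b's and falls into its dead state n

    final = {s for s in states if s != (m, n)}
    return states, step, final


def count_classes(states, step, final, alphabet="ab"):
    """Moore-style partition refinement; returns the number of inequivalent states."""
    block = {s: int(s in final) for s in states}
    while True:
        ids = {}
        refined = {}
        for s in states:
            signature = (block[s],) + tuple(block[step(s, x)] for x in alphabet)
            refined[s] = ids.setdefault(signature, len(ids))
        if len(ids) == len(set(block.values())):
            return len(ids)
        block = refined


m, n = 3, 4
states, step, final = union_product(m, n)
# every state (i, j) is reachable by a^i b^j; the dead state (m, n) is not counted
isc_union = count_classes(states, step, final) - 1
```
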


In the nondeterministic case, the upper bound for union on regular languages is m + n + 1, and it is
tight in the binary case [10]. We get the same bound for union on prefix-closed languages; however, to
define witness languages, we use a four-letter alphabet.


Theorem 10. Let K and L be prefix-closed languages over an alphabet Σ with nsc(K) = m and nsc(L) = n.
Then nsc(K ∪ L) ≤ m + n + 1, and the bound is tight if |Σ| ≥ 4.


Proof. The upper bound is the same as for regular languages [10]. To prove tightness, let K and L be the
prefix-closed languages accepted by the NFAs A and B, respectively, shown in Figure 4. Let

    𝒜 = {(a^i, a^{m-1-i} b) | i = 1, 2, ..., m - 1} ∪ {(a^{m-1} b, a)},
    ℬ = {(c^j, c^{n-1-j} d) | j = 1, 2, ..., n - 1} ∪ {(c^{n-1} d, c)}.

Let us show that 𝒜 ∪ ℬ is a fooling set for the language K ∪ L.

(F1) We have a^i · a^{m-1-i} b = a^{m-1} b and c^j · c^{n-1-j} d = c^{n-1} d. Both these strings are in K ∪ L. The
strings a^{m-1} b · a and c^{n-1} d · c are in K ∪ L as well.

(F2) If 1 ≤ i < i' ≤ m - 1, then the string a^i · a^{m-1-i'} b is not in K since m - 1 - (i' - i) < m - 1. Next,
if 1 ≤ i ≤ m - 1, then a^{m-1} b · a^{m-1-i} b is not in K. The argumentation for two pairs from ℬ is similar.
If we concatenate the first part of a pair in 𝒜 with the second part of a pair in ℬ, then we get a string
that either contains all three symbols a, c, d, or contains both symbols a and d. No such string is in K ∪ L.

Thus 𝒜 ∪ ℬ is a fooling set for the language K ∪ L. Moreover, the sets 𝒜 ∪ {(ε, c)} and ℬ ∪ {(ε, a)}
are fooling sets for K ∪ L as well. By Lemma 2, we have nsc(K ∪ L) ≥ m + n + 1. □


5 Concatenation 


In this section, we deal with the concatenation operation on prefix-closed languages. We start with 
incomplete state complexity. We use a slightly different ternary witness language than in [5], and prove 
the upper bound using automata constructions. 

Figure 5: The incomplete DFAs A and B of languages K and L with isc(KL) = m·2^{n-1} + 2^n - 1.


Theorem 11. Let m, n ≥ 3. Let K and L be prefix-closed languages over an alphabet Σ with isc(K) = m
and isc(L) = n. Then isc(KL) ≤ m·2^{n-1} + 2^n - 1, and the bound is tight if |Σ| ≥ 3.


Proof. Let A = (Q_A, Σ, δ_A, s_A, Q_A) and B = (Q_B, Σ, δ_B, s_B, Q_B) be incomplete DFAs with all states final
accepting the languages K and L, respectively. Construct an NFA N for the language KL from the DFAs
A and B by adding the transition on a symbol a from a state q in Q_A to the initial state s_B of B whenever
the transition on a in state q is defined in A. The initial states of the NFA N are s_A and s_B, and the set of
final states is Q_B. Each reachable subset of the subset automaton of the NFA N contains at most one state
of Q_A, and several states of Q_B. Moreover, if a state of Q_A is in a reachable subset S, then S must contain
the state s_B. This gives the upper bound m·2^{n-1} + 2^n - 1 on isc(KL) since the empty set is not counted.

For tightness, consider the prefix-closed languages K and L accepted by incomplete DFAs A and B,
respectively, shown in Figure 5, in which the transitions are as follows:

on a, state q_0 goes to itself, and each state j goes to (j + 1) mod n;

on b, each state q_i goes to state q_0, state 0 goes to itself, and each state j with 1 ≤ j ≤ n - 2 goes to j + 1;

on c, each state q_i with 0 ≤ i ≤ m - 2 goes to q_{i+1}, and each state j goes to itself;

and all the remaining transitions are undefined.

Construct an NFA N for the language KL as described above. Let us show that the subset automaton
of the NFA N has m·2^{n-1} + 2^n - 1 reachable and pairwise distinguishable non-empty subsets.

(1) First, let us show that each set {q_0} ∪ S is reachable, where S ⊆ {0, 1, ..., n - 1} and 0 ∈ S. The
proof is by induction on the size of subsets. The set {q_0, 0} is the initial subset. The set {q_0, 0, j_1, j_2, ..., j_k}
with 1 ≤ j_1 < j_2 < ... < j_k ≤ n - 1 is reached from the set {q_0, 0, j_2 - j_1, ..., j_k - j_1} by the string a b^{j_1 - 1},
and the latter set is reachable by induction.

(2) Now, let us show that each set {q_i} ∪ S is reachable, where 1 ≤ i ≤ m - 1, S ⊆ {0, 1, ..., n - 1},
and 0 ∈ S. The set {q_i} ∪ S is reached from {q_0} ∪ S by c^i, and the latter set is reachable as shown in (1).

(3) Next, we show that each set S with S ⊆ {0, 1, ..., n - 1} and 0 ∈ S is reachable. The set S is
reached from {q_{m-1}} ∪ S by c, and the latter set is reachable as shown in case (2).

(4) Finally, we show that each non-empty set S with S ⊆ {0, 1, ..., n - 1} and 0 ∉ S is reachable. If
S = {j_1, j_2, ..., j_k} with j_1 ≥ 1, then S is reached from the set {0, j_2 - j_1, ..., j_k - j_1} by a^{j_1}, and the
latter set is reachable as shown in case (3).

This proves the reachability of m·2^{n-1} + 2^n - 1 non-empty subsets.

To prove distinguishability, notice that the string b^n is accepted by the DFA B only from the state 0,
and the string a^{n-1-i} a b^n = a^{n-i} b^n is accepted only from the state i (1 ≤ i ≤ n - 1). If S and T are two distinct
subsets of {0, 1, ..., n - 1}, then S and T differ in a state i. If i = 0, then b^n distinguishes S and T, and if
i ≥ 1, then a^{n-i} b^n distinguishes S and T.

Figure 6: The incomplete DFAs of prefix-closed languages K and L with nsc(KL) = m + n.


Next, the sets {q_i} ∪ S and {q_i} ∪ T, where S and T are distinct subsets of {0, 1, ..., n - 1}, go to
S and T, respectively, by c^m. Since S and T are distinguishable, the sets {q_i} ∪ S and {q_i} ∪ T are
distinguishable as well.

Finally, notice that the string b^n a b^n is accepted by the NFA N from each state q_i, but rejected from
each state j in {0, 1, ..., n - 1}. Hence the sets {q_i} ∪ S and T, where S and T are subsets of {0, ..., n - 1},
are distinguishable. Now let 0 ≤ i < j ≤ m - 1. Then {q_i} ∪ S and {q_j} ∪ T go to {q_{i+m-j}} ∪ S and T,
respectively, by c^{m-j}. Since {q_{i+m-j}} ∪ S and T are distinguishable, the sets {q_i} ∪ S and {q_j} ∪ T are
distinguishable as well. This proves the distinguishability of all the reachable subsets, and completes the
proof. □
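
The number of reachable non-empty subsets can be checked by brute force for small m and n. The following Python fragment is a sketch only; the function names and the encoding of the states of A as pairs ("q", i) are assumptions, and the transitions follow the verbal description of the witness DFAs given in the proof. Distinguishability is not checked here.

```python
def concat_nfa(m, n):
    """NFA N for KL: states ('q', i) come from A, states 0..n-1 come from B."""
    delta = {}

    def add_a(i, symbol, target):
        # a defined transition of A also enters the initial state 0 of B
        delta[(("q", i), symbol)] = {("q", target), 0}

    add_a(0, "a", 0)                   # on a, only q_0 has a transition
    for i in range(m):
        add_a(i, "b", 0)               # on b, each q_i goes to q_0
    for i in range(m - 1):
        add_a(i, "c", i + 1)           # on c, q_i goes to q_{i+1} for i <= m - 2
    for j in range(n):
        delta[(j, "a")] = {(j + 1) % n}
        delta[(j, "c")] = {j}
    delta[(0, "b")] = {0}
    for j in range(1, n - 1):
        delta[(j, "b")] = {j + 1}
    return delta


def reachable_nonempty(delta, start):
    """Non-empty reachable subsets of the subset automaton."""
    seen = {frozenset(start)}
    todo = [frozenset(start)]
    while todo:
        s = todo.pop()
        for symbol in "abc":
            t = frozenset(q for p in s for q in delta.get((p, symbol), ()))
            if t and t not in seen:
                seen.add(t)
                todo.append(t)
    return seen


def count(m, n):
    return len(reachable_nonempty(concat_nfa(m, n), {("q", 0), 0}))
```
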


In the next theorem, we consider the nondeterministic case. For regular languages, the upper bound 
on the nondeterministic state complexity of concatenation is m +n, and it is tight in the binary case 
[10]. For prefix-closed languages, we get the same bound for concatenation. However, we define witness 
languages over a ternary alphabet. 

Theorem 12. Let m, n ≥ 3. Let K and L be prefix-closed languages over an alphabet Σ with nsc(K) = m
and nsc(L) = n. Then nsc(KL) ≤ m + n, and the bound is tight if |Σ| ≥ 3.


Proof. The upper bound is the same as for regular languages [10]. For tightness, consider the ternary
prefix-closed languages K and L accepted by incomplete DFAs A and B, respectively, shown in Figure 6.
Notice that if a string w is in KL, then w is in the language b*a*c*b*a*c*, and the number of a's in w is
at most n + m - 2.

For i = 0, 1, ..., m + n - 1, define the pair (x_i, y_i) as follows:

    (x_i, y_i) = (a^i, a^{m-1-i} c b a^{n-1}), for i = 0, 1, ..., m - 1,
    (x_{m+j}, y_{m+j}) = (a^{m-1} c b a^j, a^{n-1-j}), for j = 0, 1, ..., n - 1.

Let us show that the set of pairs F = {(x_i, y_i) | i = 0, 1, ..., m + n - 1} is a fooling set for the language KL.

(F1) For each i, we have x_i y_i = a^{m-1} c b a^{n-1}. Thus x_i y_i is in KL since a^{m-1} c is in K and b a^{n-1} is in L.

(F2) Let i < j and (i, j) ≠ (m - 1, m). Then the number of a's in the string x_j y_i is greater than
m + n - 2, and therefore the string x_j y_i is not in KL. If (i, j) = (m - 1, m), then x_m y_{m-1} = a^{m-1} c b c b a^{n-1}.
Thus x_m y_{m-1} is not in b*a*c*b*a*c*, and therefore it is not in KL.

Hence the set F is a fooling set for the language KL, so nsc(KL) ≥ m + n. □


6 Star and Reversal 


We conclude our paper with the star and reversal operations on prefix-closed languages. The star of a
language L is the language L* = ⋃_{i≥0} L^i, where L^0 = {ε} and L^{i+1} = L^i · L.

Figure 7: The incomplete DFA A of a prefix-closed language L with isc(L*) = 2^{n-1}; n = 6.


If a regular language L is accepted by a complete n-state DFA, then the language L* is accepted by a
DFA of at most 3/4 · 2^n states, and the bound is tight in the binary case [20, 25].

For prefix-closed languages, the upper bound on the quotient complexity for star is 2^{n-2} + 1, and it
has been shown to be tight in the ternary case [5]. In the case of incomplete state complexity, we get the
bound 2^{n-1}. For the sake of completeness, we give a simple proof of the upper bound using automata
constructions. Moreover, we are able to define a witness language over a binary alphabet.


Theorem 13. Let n ≥ 4. Let L be a prefix-closed regular language over an alphabet Σ with isc(L) = n.
Then isc(L*) ≤ 2^{n-1}, and the bound is tight if |Σ| ≥ 2.


Proof. Let A = (Q, Σ, ·, s, Q) be an incomplete DFA for L. Construct an NFA A* for L* from the DFA
A by adding the transition on a symbol a from a state q to the initial state s whenever the transition q · a
is defined. In the subset automaton of the NFA A*, each reachable set is either empty, or it contains the
initial state s. It follows that isc(L*) ≤ 2^{n-1}.

For tightness, consider the binary incomplete DFA with the state set {1, 2, ..., n}, the initial state
1 and with all states final. The transitions are as follows. By a, the transitions in states 1 and 2 are
undefined, each odd state i with 3 ≤ i ≤ n - 1 goes to i + 1, and each even state i with 3 ≤ i ≤ n - 1 goes
to i - 1. By b, there is a cycle (1, 2, 3), each odd state i with 4 ≤ i ≤ n - 1 goes to i - 1, and each even
state i with 4 ≤ i ≤ n - 1 goes to i + 1. If n is odd, then n goes to itself by a, otherwise it goes to itself
by b. The DFA for n = 6 is shown in Figure 7.

Notice that each state i with 3 ≤ i ≤ n has exactly one in-transition on a and on b. Denote by a^{-1}(i)
the state that goes to i on a, and by b^{-1}(i) the state that goes to i on b.

Construct an NFA A* as described above. Let us show that in the subset automaton of the NFA A*, 
all subsets of {1,2,...,n} containing state 1 are reachable and pairwise distinguishable. 

We prove reachability by induction on the size of subsets. The basis is |S| = 1, and the set {1} is
reachable since it is the initial state of the subset automaton. Assume that every set S containing 1 with
|S| = k, where 1 ≤ k ≤ n - 1, is reachable. Let S = {1, i_1, i_2, i_3, ..., i_k}, where 2 ≤ i_1 < i_2 < ... < i_k ≤ n,
be a set of size k + 1. Consider three cases:


(i) i_1 = 2. Take S' = {1, b^{-1}(i_2), b^{-1}(i_3), ..., b^{-1}(i_k)}. Then |S'| = k, and therefore S' is reachable by
the induction hypothesis. Since we have S' --b--> {1, 2, i_2, ..., i_k} = S, the set S is reachable.

(ii) i_1 = 3. Take S' = {1, 2, b^{-1}(i_2), b^{-1}(i_3), ..., b^{-1}(i_k)}. Then |S'| = k + 1 and S' contains states 1 and
2. Therefore, the set S' is reachable as shown in case (i). Since we have S' --b--> {1, 2, 3, i_2, i_3, ..., i_k} --aa-->
{1, 3, i_2, i_3, ..., i_k} = S, the set S is reachable.


(iii) Let i_1 = j ≥ 3, and assume that each set {1, j, i_2, ..., i_k} is reachable. Let us show that then also
each set {1, j + 1, i_2, ..., i_k} is reachable. If j is odd, then the set {1, j + 1, i_2, ..., i_k} is reached
from the set {1, j, a^{-1}(i_2), a^{-1}(i_3), ..., a^{-1}(i_k)} by a. If j is even, then the set {1, j + 1, i_2, ..., i_k}
is reached from the set {1, j, b^{-1}(i_2), b^{-1}(i_3), ..., b^{-1}(i_k)} by baa.


K. Cevorova, G. Jirásková, P. Mlynarcik, M. Palmovsky, J. Šebej 211 


This proves reachability. To prove distinguishability, notice that the string (ab)^{n-2} is accepted by the
NFA A* from state 3 since state 3 goes to the initial state 1 by (ab)^{n-2} through the computation

    3 --ab--> 5 --ab--> ... --ab--> n --ab--> n - 1 --ab--> n - 3 --ab--> ... --ab--> 4 --ab--> 1

if n is odd, and through a similar computation if n is even. On the other hand, the string (ab)^{n-2} cannot
be read from any other state 2i with 2 ≤ i ≤ n/2 since we have

    2i --a--> {2i - 1, 1} --b--> {2i - 2, 1, 2} --a--> ... --b--> {4, 1, 2} --a--> {3, 1} --b--> {1, 2} --a--> ∅,

thus 2i goes to the empty set by (ab)^i, so also by (ab)^{n-2}. If n is odd, then we have

    2i + 1 --a--> {2i + 2, 1} --b--> {2i + 3, 1, 2} --a--> ... --b--> {n, 1, 2} --a--> {n, 1} --b--> {n - 1, 1, 2} --a--> {n - 2, 1} --b-->
    {n - 3, 1, 2} --a--> ... --b--> {1, 2} --a--> ∅,

thus 2i + 1 goes to the empty set by (ab)^{n-i}, i ≥ 2, and so also by (ab)^{n-2}. For n even, the argument
is similar. The string (ab)^{n-2} is not accepted from states 1 and 2. Hence the NFA A* accepts the string
(ab)^{n-2} only from the state 3. Since there is exactly one in-transition on b in state 3, and it goes from state
2, the string b(ab)^{n-2} is accepted by A* only from state 2. Similarly, the string bb(ab)^{n-2} is accepted by
A* only from state 1. Next, for similar reasons, the string a(ab)^{n-2} is accepted only from 4, the string
ba(ab)^{n-2} is accepted only from 5, and in the general case, the string (ab)^i a (ab)^{n-2} is accepted only
from 4 + 2i (i ≥ 0), and the string (ba)^i (ab)^{n-2} is accepted only from 3 + 2i (i ≥ 1). Hence for each state
q of the NFA A*, there exists a string w_q that is accepted by A* only from the state q. It follows that all
the subsets of the subset automaton of the NFA A* are pairwise distinguishable since two distinct subsets
differ in a state q, and the string w_q distinguishes the two subsets. This completes the proof. □


We did some computations in the binary case. Using lists of n-state minimal binary pairwise
non-isomorphic complete DFAs with a dead state and all the remaining states final, we computed the
state complexity of the star of the languages accepted by the DFAs on the lists; here the state complexity of
a regular language $L$, $\mathrm{sc}(L)$, is defined as the smallest number of states in any complete DFA for the
language $L$. We computed the frequencies of the resulting complexities, and the average complexity of
star. Our results are summarized in Table 1. Notice that for $n = 3, 4, 5$, there is just one language with
$\mathrm{sc}(L) = n$ and $\mathrm{sc}(L^*) = 2$. Let us show that this holds for every $n$ with $n \ge 3$.


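n \ sc(L*) |  1 |  2 |  3 |   4 |   5 |  6 |  7 |  8 |  9 | average
-----------+----+----+----+-----+-----+----+----+----+----+--------
         2 |  - |  2 |  - |   - |   - |  - |  - |  - |  - | 2
         3 |  ? |  1 |  ? |   ? |   ? |  ? |  ? |  ? |  ? | ?
         4 |  ? |  1 |  ? |  30 |   6 |  - |  - |  - |  - | ?
         5 |  ? |  1 |  ? | 275 | 350 |  ? | 84 |  - | 26 | ?

(Entries marked ? are illegible in the source.)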

Table 1: The frequencies of the complexities and the average complexity of star on prefix-closed
languages in the binary case; n = 2, 3, 4, 5.

Figure 8: The only binary n-state complete DFA of a prefix-closed language L with sc(L*) = 2.


Proposition 14. Let $n \ge 3$. There exists exactly one (up to renaming of alphabet symbols) binary
prefix-closed regular language $L$ with $\mathrm{sc}(L) = n$ and $\mathrm{sc}(L^*) = 2$.


Proof. Let $A = (\{0,1\}, \{a,b\}, \delta, 0, F)$ be a minimal two-state DFA for the language $L^*$. Since $L$ is
prefix-closed, the language $L^*$ is prefix-closed as well. It follows that state 0 is final, and state 1 is dead, thus
$F = \{0\}$ and $\delta(1,a) = \delta(1,b) = 1$.

Without loss of generality, state 1 is reached from the initial state 0 by $a$, thus $\delta(0,a) = 1$.

Since $n \ge 3$, the language $L$ contains a non-empty string. This means that the language $L^*$ contains a
non-empty string as well. Therefore, we must have $\delta(0,b) = 0$, and so $L^* = b^*$.

Now let $B$ be the minimal n-state DFA for $L$. Then all the states of $B$ are final, except for the dead
state. Since $L^* = b^*$, no $a$ may occur in any string of $L$. Hence each non-dead state of $B$ must go to
the dead state on $a$. Since all states must be reachable, we must have a path labeled by $b^{n-2}$ and going
through all the final states. The last final state must go to the dead state on $b$ because otherwise all final
states would be equivalent. The resulting n-state DFA $B$ is shown in Figure 8. □
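
As a sanity check of Proposition 14 for small n, the following Python sketch brute-forces the language L = {b^k | 0 <= k <= n-2} accepted by the DFA B from the proof and counts quotients of L and of L* on strings of bounded length. The helper names (in_L, in_star, words, quotient_count) and the length cutoff are illustrative assumptions, not part of the paper; the truncated count coincides with the true number of quotients here only because both languages are so simple.

```python
from itertools import product

def in_L(w, n):
    # L = { b^k : 0 <= k <= n-2 }: the language of the DFA B from the proof
    return set(w) <= {"b"} and len(w) <= n - 2

def in_star(w, n):
    # w is in L* iff w splits into non-empty factors from L (dynamic programming)
    ok = [True] + [False] * len(w)
    for j in range(1, len(w) + 1):
        ok[j] = any(ok[i] and in_L(w[i:j], n) for i in range(j))
    return ok[-1]

def words(max_len):
    # All strings over {a, b} of length at most max_len
    for k in range(max_len + 1):
        for letters in product("ab", repeat=k):
            yield "".join(letters)

def quotient_count(member, max_len):
    # Distinct quotients, restricted to prefixes and suffixes of length <= max_len
    ws = list(words(max_len))
    return len({frozenset(v for v in ws if member(u + v)) for u in ws})

for n in (3, 4, 5):
    sc_L = quotient_count(lambda w: in_L(w, n), n)
    sc_star = quotient_count(lambda w: in_star(w, n), n)
    print(n, sc_L, sc_star)
```

For n = 3, 4, 5 the loop reports n quotients for L (including the empty, dead quotient) and 2 quotients for L*, in line with sc(L) = n and sc(L*) = 2.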


The reverse $w^R$ of a string $w$ is defined by $\varepsilon^R = \varepsilon$, and $(wa)^R = a w^R$ for $a$ in $\Sigma$ and $w$ in $\Sigma^*$. The
reverse of a language $L$ is the language $L^R = \{w^R \mid w \in L\}$. If a regular language $L$ is accepted by a
complete n-state DFA, then the language $L^R$ is accepted by a complete DFA of at most $2^n$ states [22, 25],
and the bound is tight in the binary case [16, 19].

For prefix-closed languages, the quotient complexity of reversal is $2^{n-1}$ [5], and it follows from the
results on ideal languages since reversal commutes with complementation, and the complement of a
prefix-closed language is a right ideal; here a language $L$ is a right ideal if $L = L \cdot \Sigma^*$.

We restate the result for reversal in terms of incomplete state complexity, and prove tightness using 
a slightly different witness language. 


Theorem 15. Let $n \ge 2$. Let $L$ be a prefix-closed regular language over an alphabet $\Sigma$ with $\mathrm{isc}(L) = n$.
Then $\mathrm{isc}(L^R) \le 2^n - 1$, and the bound is tight if $|\Sigma| \ge 2$.


Proof. Let $A$ be an n-state incomplete DFA for $L$. Construct an NFA $A^R$ for the language $L^R$ from the DFA $A$ by
swapping the role of the initial and final states, and by reversing all the transitions. The subset automaton
of the NFA $A^R$ has at most $2^n - 1$ non-empty reachable states, and the upper bound follows.

For tightness, consider the incomplete DFA $A$ with all states final, shown in Figure 9. Construct an
NFA $A^R$ as described above. In the subset automaton of the NFA $A^R$, the initial state is $\{1,2,\ldots,n\}$. If $S$
is a subset and if $i \in S$, then the subset $S \setminus \{i\}$ is reached from $S$ by $a^i b a^{n-i}$. This proves the reachability
of all non-empty subsets by induction. Since the states of the subset automaton of any reversed DFA
are pairwise distinguishable [7, 16, 22], the theorem follows. □
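
The reversal construction can be tried out on a small example. The sketch below does not use the automaton of Figure 9, which is not reproduced here; it assumes a hypothetical witness on states 1..n (a is a cyclic shift, b fixes every state except n, where it is undefined), chosen so that reading a^i b a^(n-i) in the reversed automaton removes state i from a subset. The function names are illustrative as well. The code reverses the transitions, runs the subset construction, and counts equivalence classes by partition refinement.

```python
def reversed_subset_automaton(n):
    # ASSUMED witness (not taken from Figure 9): states 1..n, all final, initial state 1;
    # a maps j -> j+1 (and n -> 1), b fixes every j < n and is undefined on n.
    delta = {(j, "a"): j % n + 1 for j in range(1, n + 1)}
    delta.update({(j, "b"): j for j in range(1, n)})
    # Reverse every transition to obtain the NFA A^R.
    rev = {}
    for (p, c), q in delta.items():
        rev.setdefault((q, c), set()).add(p)
    # Subset construction; all states of A are final, so A^R starts in {1,...,n}.
    start = frozenset(range(1, n + 1))
    trans, todo = {}, [start]
    while todo:
        S = todo.pop()
        if S in trans:
            continue
        trans[S] = {c: frozenset(p for q in S for p in rev.get((q, c), ()))
                    for c in "ab"}
        todo.extend(trans[S].values())
    return trans

def count_classes(trans, is_final):
    # Moore-style partition refinement on the reachable subsets.
    block = {S: int(is_final(S)) for S in trans}
    while True:
        sig = {S: (block[S],) + tuple(block[trans[S][c]] for c in "ab")
               for S in trans}
        ids = {}
        new = {S: ids.setdefault(sig[S], len(ids)) for S in trans}
        if len(ids) == len(set(block.values())):
            return len(ids)
        block = new

for n in range(2, 6):
    trans = reversed_subset_automaton(n)
    # A subset is final in A^R iff it contains state 1, the initial state of A.
    classes = count_classes(trans, lambda S: 1 in S)
    # Subtract 1 to drop the empty (dead) subset, as in an incomplete DFA.
    print(n, len(trans) - 1, classes - 1)
```

For this assumed witness, both printed counts equal 2^n - 1 for n = 2, ..., 5: every non-empty subset is reachable, and no two reachable subsets are equivalent.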


Now, let us turn to the nondeterministic case. For regular languages, the tight bound for both star and 
reversal is $n+1$. It is met by a unary language for star [10], and by a binary language for reversal [12].

Figure 9: The incomplete DFA $A$ of a language $L$ with $\mathrm{isc}(L^R) = 2^n - 1$, and the NFA $A^R$.


For prefix-closed languages, we get the same bound for reversal. However, for star, the upper bound 
is n since every prefix-closed language contains the empty string, and there is no need to add a new initial 
state in the construction of an NFA for star. In the following theorem, we show that both these bounds 
are tight in the binary case. 


Theorem 16. Let $n \ge 2$. Let $L$ be a prefix-closed language over an alphabet $\Sigma$ with $\mathrm{nsc}(L) = n$. Then
(1) $\mathrm{nsc}(L^*) \le n$,
(2) $\mathrm{nsc}(L^R) \le n+1$,

and both bounds are tight if $|\Sigma| \ge 2$.


Proof. (1) Let $N = (Q, \Sigma, \delta, s, F)$ be an n-state NFA for $L$. Since $L$ is prefix-closed, the empty string is
in $L$. Therefore, we can get an n-state NFA for the language $L^*$ from the NFA $N$ as follows: for each
state $q$ and each symbol $a$ such that $\delta(q,a) \cap F \ne \emptyset$, we add a transition on $a$ from $q$ to the initial state $s$.
Thus the upper bound is $n$.
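
The construction just described can be sketched in Python as follows. The dictionary encoding of the NFA, the function names, and the three-state toy language L = {empty string, a, ab} are assumptions made for illustration only. The sketch adds the extra transitions to the initial state and compares the resulting automaton with a brute-force membership test for L* on all short strings.

```python
from itertools import product

def star_nfa(delta, start, finals):
    # Add q --a--> start whenever q --a--> some final state; no new state is needed
    # because the empty string is already accepted (the initial state is final).
    new_delta = {}
    for (q, a), targets in delta.items():
        extra = {start} if targets & finals else set()
        new_delta[(q, a)] = set(targets) | extra
    return new_delta

def accepts(delta, start, finals, w):
    # Standard on-the-fly subset simulation of an NFA.
    current = {start}
    for a in w:
        current = {p for q in current for p in delta.get((q, a), ())}
    return bool(current & finals)

# Toy example (an assumption): L = {"", "a", "ab"}, accepted with all states final.
delta = {(0, "a"): {1}, (1, "b"): {2}}
finals = {0, 1, 2}
star = star_nfa(delta, 0, finals)

def in_star_bruteforce(w, L=("a", "ab")):
    # w is in L* iff it splits into non-empty factors from L.
    ok = [True] + [False] * len(w)
    for j in range(1, len(w) + 1):
        ok[j] = any(ok[i] and w[i:j] in L for i in range(j))
    return ok[-1]

agree = all(
    accepts(star, 0, finals, "".join(w)) == in_star_bruteforce("".join(w))
    for k in range(7) for w in product("ab", repeat=k)
)
print(agree)
```

On this toy language the two membership tests agree on all strings of length at most 6, and the automaton for L* uses the same three states as the automaton for L.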

For tightness, consider the prefix-closed language $L$ accepted by the NFA $A$ shown in Figure 10.
Consider the set of pairs of strings $\mathcal{F} = \{(a^i, a^{n-1-i}b) \mid i = 0, 1, \ldots, n-1\}$ of size $n$. Let us show that $\mathcal{F}$
is a fooling set for the language $L^*$.

(F1) We have $a^i a^{n-1-i} b = a^{n-1} b$. Since the string $a^{n-1} b$ is in $L$, it also is in $L^*$.

(F2) Let $i < j$. Then $a^i a^{n-1-j} b = a^{n-1-(j-i)} b$. Since no string $a^\ell b$ with $\ell < n-1$ is in $L$, the string
$a^{n-1-(j-i)} b$ is not in $L^*$.

Hence the set $\mathcal{F}$ is a fooling set for the language $L^*$, and the lower bound follows.
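
The fooling-set conditions can be checked mechanically for small n. Since the NFA of Figure 10 is not reproduced here, the sketch below assumes a hypothetical stand-in language, namely the set of all prefixes of (a^(n-1) b)*, which is accepted by an n-state automaton with all states final and satisfies the two facts used in (F1) and (F2). The function names are illustrative. The check uses the usual definition of a fooling set: every x_i y_i is in the language, and for i different from j at least one of x_i y_j, x_j y_i is not.

```python
def in_L(w, n):
    # ASSUMED witness: prefixes of (a^(n-1) b)*, read by an n-state chain
    # 0 -a-> 1 -a-> ... -a-> n-1 -b-> 0 with all states final.
    state = 0
    for c in w:
        if c == "a" and state < n - 1:
            state += 1
        elif c == "b" and state == n - 1:
            state = 0
        else:
            return False
    return True

def in_star(w, n):
    # w is in L* iff it splits into non-empty factors from L.
    ok = [True] + [False] * len(w)
    for j in range(1, len(w) + 1):
        ok[j] = any(ok[i] and in_L(w[i:j], n) for i in range(j))
    return ok[-1]

def is_fooling_set(pairs, member):
    # (F1): every x_i y_i is in the language.
    diagonal = all(member(x + y) for x, y in pairs)
    # (F2): for i != j, x_i y_j or x_j y_i is not in the language.
    off_diagonal = all(
        not member(x1 + y2) or not member(x2 + y1)
        for k, (x1, y1) in enumerate(pairs)
        for (x2, y2) in pairs[k + 1:]
    )
    return diagonal and off_diagonal

def witness_pairs(n):
    # The pairs (a^i, a^(n-1-i) b) for i = 0, ..., n-1.
    return [("a" * i, "a" * (n - 1 - i) + "b") for i in range(n)]

for n in range(2, 7):
    print(n, is_fooling_set(witness_pairs(n), lambda w: in_star(w, n)))
```

For the assumed language the check succeeds for n = 2, ..., 6, which mirrors the argument in (F1) and (F2): a string a^l b with l < n-1 cannot end with a factor from L.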

(2) The upper bound is the same as for regular languages [10]. It is shown in Theorem 2] that
this bound is met by the binary prefix-closed language L accepted by the NFA shown in Figure 10.
The proof in is by a counting argument. Notice that Lemma 2 is satisfied for the language L
with Z = { (ba, a"i) |i =0,1,...,n 2}, = {(ba" |, ba"!)},u = ba"™!, and v =a. This gives 
$\mathrm{nsc}(L^R) \ge n+1$ immediately. □




Figure 10: The NFA of a prefix-closed language $L$ with $\mathrm{nsc}(L^*) = n$ and $\mathrm{nsc}(L^R) = n+1$.

[Table body illegible in the source. Rows: isc on prefix-closed; sc on prefix-closed; sc on regular; nsc on prefix-closed; nsc on regular.]

Table 3: The complexity of concatenation, star, and reversal on prefix-closed and regular languages. 


7 Conclusions 


In this paper we considered operations on languages recognized by incomplete deterministic or
nondeterministic finite automata with all states final. Our results are summarized in Tables 2 and 3. The
results on quotient (state) complexity on prefix-closed languages are from [5], and the results for regular
languages are from [12, 20, 25]. Notice that in the nondeterministic case, our results are the same
as in the general case of regular languages, except for the star operation. However, to prove tightness,
we usually used larger alphabets than in the general case. Whether or not these bounds are tight also for
smaller alphabets remains open.